CommanderSong: A Systematic Approach for Practical Adversarial Voice Recognition
Abstract
The popularity of automatic speech recognition (ASR) systems such as Google Voice and Cortana raises security concerns, as demonstrated by recent attacks. The impact of such threats, however, is less clear, since the attacks are either less stealthy (producing noise-like voice commands) or require the physical presence of an attack device (using ultrasound). In this paper, we demonstrate that not only are more practical and surreptitious attacks feasible, but they can even be constructed automatically. Specifically, we find that voice commands can be stealthily embedded into songs, which, when played, can effectively control the target system through ASR without being noticed. For this purpose, we developed novel techniques that address a key technical challenge: integrating the commands into a song in a way that can be effectively recognized by ASR through the air, in the presence of background noise, while not being detected by a human listener. Our research shows that this can be done automatically against real-world ASR applications. We also demonstrate that such CommanderSongs can be spread through the Internet (e.g., YouTube) and radio, potentially affecting millions of ASR users. We further present a new mitigation technique that controls this threat.
Similar resources
Voice-based Age and Gender Recognition using Training Generative Sparse Model
Abstract: Gender recognition and age detection are important problems in telephone speech processing to investigate the identity of an individual using voice characteristics. In this paper a new gender and age recognition system is introduced based on generative incoherent models learned using sparse non-negative matrix factorization and atom correction post-processing method. Similar to genera...
Fast Feature Fool: A data independent approach to universal adversarial perturbations
State-of-the-art object recognition Convolutional Neural Networks (CNNs) are shown to be fooled by image agnostic perturbations, called universal adversarial perturbations. It is also observed that these perturbations generalize across multiple networks trained on the same target data. However, these algorithms require training data on which the CNNs were trained and compute adversarial perturb...
Identifying Security Evaluation of Pattern Classifiers Under attack
Pattern classification is a branch of machine learning that focuses on recognition of patterns and regularities in data. Pattern classification systems are used in adversarial applications such as biometric authentication, spam filtering, and network intrusion detection. As this adversarial scenario is not taken into account by classical design methods, pattern classification systems may exhibit v...
Crafting Adversarial Examples For Speech Paralinguistics Applications
Computational paralinguistic analysis is increasingly being used in a wide range of applications, including security-sensitive applications such as speaker verification, deceptive speech detection, and medical diagnostics. While state-of-the-art machine learning techniques, such as deep neural networks, can provide robust and accurate speech analysis, they are susceptible to adversarial attacks. ...
Voice Signatures
Most current spoken-dialog systems only extract sequences of words from a speaker’s voice. This largely ignores other useful information that can be inferred from speech such as gender, age, dialect, or emotion. These characteristics of a speaker’s voice, voice signatures, whether static or dynamic, can be useful for speech mining applications or for the design of a natural spoken-dialog system...
Journal title: CoRR
Volume: abs/1801.08535
Publication date: 2018